32 research outputs found

    Incremental schema integration for data wrangling via knowledge graphs

    Get PDF
    Virtual data integration is the current go-to approach for data wrangling in data-driven decision-making. In this paper, we focus on automating schema integration, which extracts a homogenised representation of the data source schemata and integrates them into a global schema to enable virtual data integration. Schema integration requires a set of well-known constructs: the data source schemata and wrappers, a global integrated schema, and the mappings between them. Based on these, virtual data integration systems enable fast and on-demand data exploration via query rewriting. Unfortunately, the generation of such constructs is currently performed largely manually, hindering its feasibility in real scenarios. This becomes aggravated when dealing with heterogeneous and evolving data sources. To overcome these issues, we propose a fully-fledged semi-automatic and incremental approach grounded on knowledge graphs to generate the required schema integration constructs in four main steps: bootstrapping, schema matching, schema integration, and generation of system-specific constructs. We also present NextiaDI, a tool implementing our approach. Finally, a comprehensive evaluation is presented to scrutinize our approach.

    This work was partly supported by the DOGO4ML project, funded by the Spanish Ministerio de Ciencia e Innovación under project PID2020-117191RB-I00, and the D3M project, funded by the Spanish Agencia Estatal de Investigación (AEI) under project PDC2021-121195-I00. Javier Flores is supported by contract 2020-DI-027 of the Industrial Doctorate Program of the Government of Catalonia and Consejo Nacional de Ciencia y Tecnología (CONACYT, Mexico). Sergi Nadal is partly supported by the Spanish Ministerio de Ciencia e Innovación, as well as the European Union – NextGenerationEU, under project FJC2020-045809-I.
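    To make the four-step pipeline mentioned in the abstract concrete, the following is a minimal, hedged sketch of an incremental schema-integration flow over RDF graphs. It is not NextiaDI's actual API; the namespace, the `mapsTo` mapping predicate, and the label-similarity matcher are illustrative assumptions, and the generation of system-specific constructs is omitted.

```python
# Hedged sketch (not NextiaDI): bootstrapping, schema matching and incremental
# schema integration over RDF graphs, assuming rdflib is installed.
from difflib import SequenceMatcher
from rdflib import Graph, Namespace, Literal, RDF, RDFS

EX = Namespace("http://example.org/schema/")   # hypothetical namespace

def bootstrap(source_name, attributes):
    """Bootstrapping: derive a source schema graph from a flat attribute list."""
    g = Graph()
    cls = EX[source_name]
    g.add((cls, RDF.type, RDFS.Class))
    for attr in attributes:
        prop = EX[f"{source_name}_{attr}"]
        g.add((prop, RDF.type, RDF.Property))
        g.add((prop, RDFS.domain, cls))
        g.add((prop, RDFS.label, Literal(attr)))
    return g

def match(schema_a, schema_b, threshold=0.6):
    """Schema matching: pair properties whose labels are lexically similar."""
    labels = lambda g: [(p, str(l)) for p, l in g.subject_objects(RDFS.label)]
    alignments = []
    for pa, la in labels(schema_a):
        for pb, lb in labels(schema_b):
            if SequenceMatcher(None, la.lower(), lb.lower()).ratio() >= threshold:
                alignments.append((pa, pb))
    return alignments

def integrate(global_graph, source_graph, alignments):
    """Schema integration: add the new source and record mappings incrementally."""
    global_graph += source_graph
    for pa, pb in alignments:
        global_graph.add((pa, EX.mapsTo, pb))   # placeholder mapping predicate
    return global_graph

# Incremental usage: fold a second source into an existing global schema.
g_global = bootstrap("orders_csv", ["order_id", "customer", "total"])
g_new    = bootstrap("sales_json", ["id", "customer_name", "total_amount"])
g_global = integrate(g_global, g_new, match(g_global, g_new))
```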

    Extraction and representation of visual content semantics

    No full text
    Digital multimedia content is omnipresent on the Web; Google posted in August 2005 a total image index size of 2,187,212,422, Yahoo estimated that its index covered 1.5 billion images at that time, while nowadays statistics show a continuous growth in these numbers (indicatively, Flickr uploads amount to an average of about 3000 images per minute). Given such numbers, the availability of machine-processable semantic descriptions for this content becomes a key factor for the realisation of applications of practical interest, perpetuating the challenge of what constitutes the multimedia community's holy grail, i.e. the semantic gap between representations that can be automatically extracted and the underlying meaning. In the late 1970s and early 1980s, influenced by the AI paradigm, the analysis and understanding of audiovisual content became a problem of achieving intelligent behaviour by simulating what humans know through computational means. Hence, the first attempts towards knowledge-directed image analysis emerged. A period of explosive growth in approaches conditioned by knowledge followed: varying knowledge representation and reasoning schemes, in accordance with the contemporary AI assets, were proposed, and knowledge attempted to address all aspects involved, ranging from perceptual characteristics of the visual manifestations to control strategies. The broad and ambitious scope targeted by the use of knowledge resulted in representations and reasoning mechanisms that exhibited high complexity and inflexibility, while the lack of well-founded semantics further reduced efficacy and interoperability. Research focus shifted to machine learning, which gained particular popularity as a means for capturing knowledge that cannot be represented effectively or explicitly. Recent analysis in multimedia has reached a point where detectors can be learned in a generic fashion for a significant number of conceptual entities. The obtained performance, however, exhibits versatile behaviour, reflecting implications over the training set selection, similarities in visual manifestations of distinct conceptual entities, and appearance variations of the conceptual entities. A factor partially accountable for these limitations relates to the fact that machine learning techniques realise the transition from visual features to conceptual entities based solely on information regarding perceptual features. Hence, a significant part of the knowledge pertaining to the semantics underlying the interpretation is missed. The advent of the Semantic Web paved a new era in knowledge sharing, reuse and interoperability, by making formal semantics explicit and machine understandable rather than just machine processable. The multimedia community embraced the new technologies, utilising ontologies at first in order to attach explicit meaning to the produced annotations (at the content and the media layers), and subsequently as a means for assisting the very extraction of the annotations. The state of the art with respect to the latter approaches is characterised by particular features, among which the poor handling of uncertainty, the restricted utilisation of formal semantics and inference services, and the focus on representing associations between perceptual features and domain entities rather than logical relations between the domain entities or on modelling analysis aspects.

    This thesis addresses the problem of how enhanced semantic descriptions of visual content may be automatically derived through the utilisation of formal semantics and reasoning, and how the domain-specific descriptions can be transparently integrated with media-related ones referring to the structure of the content. The central contributions of the thesis lie in: i) the definition of a unified representation of the domain-specific knowledge required for the extraction of semantics and of the analysis-specific knowledge that implements the process of extraction, ii) the development of a formal reasoning framework that supports uncertainty handling for the purpose of the semantic integration and enrichment of initial descriptions deriving from different analysis systems, and iii) the definition of an MPEG-7 compliant ontology that formally captures the structure of multimedia content, allowing for precise semantics and serving as a means for the definition of mappings between the existing ontologies addressing multimedia content structural aspects. The first refers to a unified ontology-based knowledge representation framework that allows one to model the process of extracting semantic descriptions in accordance with perceptual and conceptual aspects of the knowledge characterising the specific domain. The use of ontologies for both knowledge components enhances the potential for sharing and reuse of the respective components, but most importantly enables the extensibility of the framework to other application domains and its sharing across different systems. More specifically, semantic concepts in the context of the examined domain are defined in an ontology, enriched with qualitative attributes (e.g., color homogeneity), low-level features (e.g., color model components distribution), object spatial relations, and multimedia processing methods (e.g., color clustering). The RDF(S) language has been used for the representation of the developed domain and analysis ontologies, while the rules that determine how tools for multimedia analysis should be applied, depending on concept attributes and low-level features, are expressed in F-Logic. The second part of the contribution refers to the development of a fuzzy DL-based reasoning framework in order to integrate image annotations at scene and region level into a semantically consistent final description, further enhanced by means of inference. The use of fuzzy DL semantics allows the uncertainty that characterises multimedia analysis and understanding to be handled formally, while the use of DLs allows one to benefit from the high expressivity and the efficient reasoning algorithms in the management of the domain-specific semantics. The initial annotations forming the input may come from different modalities and analysis implementations, and their degrees can be re-adjusted using weights to specify the reliability of the corresponding analysis technique or modality. Finally, the third part tackles the engineering of a multimedia ontology, and more specifically of one addressing aspects related to the structure and decomposition schemes of multimedia content. The existing MPEG-7 based ontologies, despite being induced by the need for formal descriptions and precise semantics, raise new interoperability issues as they build on different rationales and are set to serve varying roles. The ontology developed within the context of this thesis re-engineers part of the MPEG-7 specifications to ensure precise semantics and transparency of meaning.
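    As a rough illustration of the reliability-weighted integration step described in the second contribution, the following is a minimal, hedged sketch of fusing fuzzy annotation degrees from several analysis modules. It is not the thesis's fuzzy DL reasoner; the concept names, reliability weights and disjointness axioms are illustrative assumptions, and a max t-conorm plus a disjointness check stand in for full DL inference.

```python
# Hedged sketch (not the thesis implementation): weighted fusion of fuzzy
# annotation degrees from several analysers, with a toy disjointness check.
from collections import defaultdict

# Per-analyser reliability weights (assumed values).
RELIABILITY = {"region_classifier": 0.9, "scene_classifier": 0.7}

# Disjoint concept pairs, standing in for DL disjointness axioms.
DISJOINT = {frozenset({"Sky", "Sea"})}

def fuse(annotations):
    """annotations: list of (analyser, concept, degree) fuzzy assertions."""
    fused = defaultdict(float)
    for analyser, concept, degree in annotations:
        weighted = degree * RELIABILITY.get(analyser, 0.5)
        fused[concept] = max(fused[concept], weighted)   # fuzzy union (max t-conorm)
    # Resolve disjointness violations by keeping the stronger assertion.
    for pair in DISJOINT:
        a, b = sorted(pair)
        if fused.get(a, 0) > 0 and fused.get(b, 0) > 0:
            weaker = a if fused[a] < fused[b] else b
            del fused[weaker]
    return dict(fused)

print(fuse([
    ("region_classifier", "Sky", 0.8),
    ("scene_classifier",  "Sea", 0.6),
    ("scene_classifier",  "Beach", 0.7),
]))
# e.g. {'Sky': 0.72, 'Beach': 0.49...} -- 'Sea' dropped as the weaker disjoint assertion
```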

    FORGe at WebNLG 2017

    No full text
    This paper describes the FORGe generator at the WebNLG challenge. The input DBpedia triples are mapped onto sentences by applying a series of rule-based graph-transducers and aggregation grammars to template predicate-argument structures associated with each property.

    FORGe at E2E 2017

    No full text
    This paper describes the FORGe generator at the E2E challenge. The input triples are mapped onto sentences by applying a series of rule-based graph-transducers and aggregation grammars to template predicate-argument structures associated with each property. We submitted two primary systems to the task, one based on the grammars and one based on templates, and one secondary system, which is a variation of the grammar-based one.
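    As a toy illustration of the template route shared by the two FORGe entries above, here is a minimal, hedged sketch of mapping input triples to per-property predicate-argument templates and naively aggregating clauses with a shared subject. It is not FORGe's grammar formalism or code; the property names and templates are illustrative assumptions.

```python
# Hedged sketch (not FORGe's actual grammars): triple-to-template realisation
# with a naive aggregation step that merges clauses sharing the same subject.
from collections import OrderedDict

# Clause templates keyed by property; "{obj}" is the only argument slot.
TEMPLATES = {
    "birthPlace": "was born in {obj}",
    "occupation": "works as a {obj}",
}

def realise(triples):
    # Group predicate-argument clauses by subject, keeping input order.
    clauses = OrderedDict()
    for subj, prop, obj in triples:
        template = TEMPLATES.get(prop, "has " + prop + " {obj}")
        clauses.setdefault(subj, []).append(template.format(obj=obj))
    # Aggregate the clauses of each subject into a single sentence.
    sentences = [f"{subj} {' and '.join(cs)}." for subj, cs in clauses.items()]
    return " ".join(sentences)

print(realise([
    ("Ada Lovelace", "birthPlace", "London"),
    ("Ada Lovelace", "occupation", "mathematician"),
]))
# -> "Ada Lovelace was born in London and works as a mathematician."
```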

    Ontology-based trajectory analysis for semantic event detection

    Get PDF
    The extraction of human-centered descriptions matching end users' cognition, and specifically the detection and identification of events in videos, is a particularly challenging problem, due to the volume and diversity of both the automatically extracted low-level features and the corresponding high-level information conveyed. Numerous efforts have begun, attempting to bridge the semantic gap between low-level data and higher-level descriptions, often resorting to domain-specific learning-based approaches. In this paper we present a novel, generally applicable approach for hierarchical semantic analysis of spatiotemporal video features (trajectories) in order to localize and detect events of interest. Dynamically changing trajectories are extracted by processing the optical flow, based on its statistics. The temporal evolution of the trajectories' geometrical and spatiotemporal characteristics forms the basis on which event detection is performed. This is based on the exploitation of prior knowledge, which provides the formal conceptualization needed to enable the automatic inference of high-level event descriptions. Experimental results with a variety of surveillance videos are presented to exemplify the usability and effectiveness of the proposed system.
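    To give a feel for the kind of trajectory characteristics involved, the following is a minimal, hedged sketch of computing simple geometric and spatiotemporal features over a tracked trajectory and applying a hand-written rule as a stand-in for the paper's ontology-driven inference. The feature set, thresholds and "loitering" rule are illustrative assumptions, not the published method.

```python
# Hedged sketch (not the paper's ontology-based reasoner): toy trajectory
# features plus a hand-written rule in place of knowledge-driven inference.
import math

def trajectory_features(points, fps=25.0):
    """points: list of (x, y) positions, one per frame."""
    path_length = sum(
        math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)
    )
    net_displacement = math.dist(points[0], points[-1])
    duration = (len(points) - 1) / fps
    return {
        "path_length": path_length,
        "net_displacement": net_displacement,
        "mean_speed": path_length / duration if duration else 0.0,
        "straightness": net_displacement / path_length if path_length else 1.0,
    }

def detect_event(features):
    # Toy rule: winding, slow movement -> flag as loitering (assumed thresholds).
    if features["straightness"] < 0.3 and features["mean_speed"] < 20.0:
        return "Loitering"
    return "NormalMovement"

# A slow, roughly circular track: low straightness, low mean speed.
track = [(100 + 10 * math.sin(i / 20), 200 + 10 * math.cos(i / 20)) for i in range(200)]
print(detect_event(trajectory_features(track)))   # -> "Loitering"
```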

    Question answering over pattern-based user models

    No full text
    Paper presented at the 12th International Conference on Semantic Systems, held in Leipzig, Germany, 13-14 September 2016.

    In this paper we present an ontology-driven framework for natural language question analysis and answering over user models (e.g. preferences, habits and health problems of individuals) that are formally captured using ontology design patterns. Pattern-based modelling is extremely useful for capturing n-ary relations in a well-defined and axiomatised manner, but it introduces additional challenges in building NL interfaces for accessing the underlying content. This is mainly due to the encapsulation of domain semantics inside conceptual layers of abstraction (e.g. using reification or container classes) that demand flexible, context-aware approaches for query analysis and interpretation. We describe the coupling of a frame-based formalisation of natural language user utterances with a context-aware query interpretation towards question answering over pattern-based RDF knowledge bases. The proposed framework is part of a human-like socially communicative agent that acts as an intermediary between elderly migrants and care personnel, assisting the latter to solicit personal information about care recipients (e.g. medical history, care needs, preferences, routines, habits, etc.).

    This work has been partially supported by the H2020-645012 project "KRISTINA: A Knowledge-Based Information Agent with Social Competence and Human Interaction Capabilities".
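    To illustrate why pattern-based (reified, n-ary) user models complicate query answering, the following is a minimal, hedged sketch of querying a preference captured through an intermediate pattern node with rdflib and SPARQL. The `ex:` vocabulary and the answer template are illustrative assumptions, not the KRISTINA ontology or pipeline.

```python
# Hedged sketch: a preference modelled as an n-ary pattern (a reified node
# linking holder, object and strength), queried with rdflib + SPARQL.
from rdflib import Graph

TTL = """
@prefix ex: <http://example.org/um#> .

ex:maria a ex:Person .
ex:pref1 a ex:Preference ;
         ex:hasHolder ex:maria ;
         ex:hasObject ex:fishDishes ;
         ex:hasStrength "high" .
"""

QUERY = """
PREFIX ex: <http://example.org/um#>
SELECT ?object ?strength WHERE {
    ?pref a ex:Preference ;
          ex:hasHolder ex:maria ;
          ex:hasObject ?object ;
          ex:hasStrength ?strength .
}
"""

g = Graph()
g.parse(data=TTL, format="turtle")
for obj, strength in g.query(QUERY):
    # A simple NL answer template over the interpreted query result.
    print(f"Maria has a {strength} preference for {obj.split('#')[-1]}.")
# -> "Maria has a high preference for fishDishes."
```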

    Semantic web technologies in pervasive computing : a survey and research roadmap

    Get PDF
    This work has been supported by the EU FP7 projects SAPERE: Self-aware Pervasive Service Ecosystems, under Contract No. 256873, and Dem@Care: Dementia Ambient Care — Multi-Sensing Monitoring for Intelligent Remote Management and Decision Support, under Contract No. 288199.

    Pervasive and sensor-driven systems are by nature open and extensible, both in terms of input and the tasks they are required to perform. Data streams coming from sensors are inherently noisy, imprecise and inaccurate, with differing sampling rates and complex correlations with each other. These characteristics pose a significant challenge for traditional approaches to storing, representing, exchanging, manipulating and programming with sensor data. Semantic Web technologies provide a uniform framework for capturing these properties. Offering powerful representation facilities and reasoning techniques, these technologies are rapidly gaining attention towards facing a range of issues such as data and knowledge modelling, querying, reasoning, service discovery, privacy and provenance. This article reviews the application of the Semantic Web to pervasive and sensor-driven systems with a focus on information modelling and reasoning along with streaming data and uncertainty handling. The strengths and weaknesses of current and projected approaches are analysed and a roadmap is derived for using the Semantic Web as a platform on which open, standards-based, pervasive, adaptive and sensor-driven systems can be deployed.
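    As a small example of the information modelling the survey discusses, here is a hedged sketch of representing a single sensor reading as an RDF observation with rdflib, roughly following SOSA/SSN-style terms. The `ex:` namespace and the `ex:confidence` property used to carry uncertainty are illustrative assumptions rather than standard vocabulary.

```python
# Hedged sketch: one sensor reading as an RDF observation (SOSA-style terms),
# with a non-standard ex:confidence annotation for uncertainty.
from datetime import datetime, timezone
from rdflib import Graph, Literal, Namespace, RDF, XSD

SOSA = Namespace("http://www.w3.org/ns/sosa/")
EX = Namespace("http://example.org/home#")

g = Graph()
g.bind("sosa", SOSA)
g.bind("ex", EX)

obs = EX["obs42"]
g.add((obs, RDF.type, SOSA.Observation))
g.add((obs, SOSA.madeBySensor, EX["livingRoomThermometer"]))
g.add((obs, SOSA.observedProperty, EX["airTemperature"]))
g.add((obs, SOSA.hasSimpleResult, Literal(21.4, datatype=XSD.double)))
g.add((obs, SOSA.resultTime,
       Literal(datetime.now(timezone.utc).isoformat(), datatype=XSD.dateTime)))
g.add((obs, EX.confidence, Literal(0.85, datatype=XSD.double)))  # illustrative

print(g.serialize(format="turtle"))
```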